AITopics | mean estimation

We characterize the fundamental limits of high-dimensional mean testing under arbitrary truncation, where samples are drawn from the conditional distribution $P(\cdot \mid S)$ for an unknown truncation set $S$ that may hide up to an $\varepsilon$-fraction of the probability mass. For distributions with $p$-th directional moments of magnitude at most $ν_{P,p}$, truncation induces a bias of order $O(ν_{P,p}\varepsilon^{1-1/p})$. This bias creates a sharp information-theoretic detectability floor: when the signal $α$ falls below this threshold, the null and alternative hypotheses are indistinguishable even with infinite data. Above this floor, we prove that a simple second-order test achieving near-optimal sample complexity $n = O\!\left(\frac{\|Σ_P\|}{(α-4ν_{P,p}\varepsilon^{1-1/p})^2}\sqrt{d}\right)$. We further identify a structural escape from this finite-moment bias barrier. Under a directional median regularity assumption, truncation bias improves to linear order $O(\varepsilon)$. This reveals an intermediate regime in which estimation requires $Θ(d)$ samples for uniform recovery, while testing recovers the classical $Θ(\sqrt d)$ rate once truncation bias is eliminated. Together, our results provide a unified framework for mean testing under truncation, connecting finite-moment, sub-Gaussian, and median-regular structural regimes.

artificial intelligence, machine learning, truncation, (17 more...)

arXiv.org Machine Learning

2605.01335

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Instance-optimal Mean Estimation Under Differential Privacy

Neural Information Processing SystemsApr-27-2026, 08:47:59 GMT

Mean estimation under differential privacy is a fundamental problem, but worstcase optimal mechanisms do not offer meaningful utility guarantees in practice when the global sensitivity is very large. Instead, various heuristics have been proposed to reduce the error on real-world data that do not resemble the worst-case instance. This paper takes a principled approach, yielding a mechanism that is instance-optimal in a strong sense. In addition to its theoretical optimality, the mechanism is also simple and practical, and adapts to a variety of data characteristics without the need of parameter tuning. It easily extends to the local and shuffle model as well.

artificial intelligence, machine learning, mechanism, (15 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.93)

Add feedback

67e235e7f2fa8800d8375409b566e6b6-Supplemental.pdf

Neural Information Processing SystemsApr-26-2026, 07:04:13 GMT

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.92)

Genre: Research Report (0.67)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Security & Privacy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Learning with User-Level Privacy

Neural Information Processing SystemsApr-26-2026, 07:04:09 GMT

We propose and analyze algorithms to solve a range of learning tasks under userlevel differential privacy constraints. Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution (m 1 samples), providing more stringent but more realistic protection against information leaks. We show that for high-dimensional mean estimation, empirical risk minimization with smooth losses, stochastic convex optimization, and learning hypothesis classes with finite metric entropy, the privacy cost decreases as O(1/ m) as users provide more samples.

artificial intelligence, machine learning, optimization problem, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.93)

Genre: Research Report (0.46)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

547b85f3fafdf30856386753dc21c4e1-Supplemental.pdf

Neural Information Processing SystemsApr-25-2026, 23:04:56 GMT

artificial intelligence, data mining, machine learning, (20 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.67)

Add feedback

547b85f3fafdf30856386753dc21c4e1-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 23:04:52 GMT

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Covariance-Aware Private Mean Estimation Without Private Covariance Estimation

Neural Information Processing SystemsApr-25-2026, 15:13:28 GMT

Informally, given n& d/α2 samples from such a distribution with mean µand covariance Σ, our estimators output µsuch that k µ µkΣ α, where k kΣ is the Mahalanobis distance. All previous estimators with the same guarantee either require strong a priori bounds on the covariance matrix or require Ω(d3/2) samples. Each of our estimators is based on a simple, general approach to designing differentially private mechanisms, but with novel technical steps to make the estimator private and sample-efficient. Our first estimator samples a point with approximately maximum Tukey depth using the exponential mechanism, but restricted to the set of points of large Tukey depth. Proving that this mechanism is private requires a novel analysis. Our second estimator perturbs the empirical mean of the data set with noise calibrated to the empirical covariance, without releasing the covariance itself. Its sample complexity guarantees hold more generally for subgaussian distributions, albeit with a slightly worse dependence on the privacy parameter. For both estimators, careful preprocessing of the data is required to satisfy differential privacy.

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report (0.69)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

182b39a4458fb4a9a8d6871a6671ff3e-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 09:03:27 GMT

artificial intelligence, machine learning, social media, (21 more...)

Neural Information Processing Systems

Country:

Europe (0.45)
North America (0.28)

Industry:

Leisure & Entertainment > Games (0.67)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Game Theory (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

Add feedback

Outlier-Robust Sparse Estimation via Non-Convex Optimization

Neural Information Processing SystemsApr-25-2026, 08:29:40 GMT

We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints, with a focus on the fundamental tasks of robust sparse mean estimation and robust sparse PCA. We develop novel and simple optimization formulations for these problems such that any approximate stationary point of the associated optimization problem yields a near-optimal solution for the underlying robust estimation task. As a corollary, we obtain that any first-order method that efficiently converges to stationarity yields an efficient algorithm for these tasks.1 The obtained algorithms are simple, practical, and succeed under broader distributional assumptions compared to prior work.

artificial intelligence, machine learning, stability condition, (19 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre:

Overview (0.67)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions

Neural Information Processing SystemsApr-25-2026, 01:35:20 GMT

We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean µ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates µwith high probability. Prior work had obtained efficient algorithms for robust sparse mean estimation of light-tailed distributions. In this work, we give the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions. Our algorithm achieves the optimal asymptotic error using a number of samples scaling logarithmically with the ambient dimension. Importantly, the sample complexity of our method is optimal as a function of the failure probability, having an additive log(1/) dependence. Our algorithm leverages the stability-based approach from the algorithmic robust statistics literature, with crucial (and necessary) adaptations required in our setting. Our analysis may be of independent interest, involving the delicate design of a (non-spectral) decomposition for positive semi-definite matrices satisfying certain sparsity properties.

artificial intelligence, machine learning, probability, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Filters

Collaborating Authors

mean estimation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Mean Testing under Truncation beyond Gaussian

Instance-optimal Mean Estimation Under Differential Privacy

67e235e7f2fa8800d8375409b566e6b6-Supplemental.pdf

Learning with User-Level Privacy

547b85f3fafdf30856386753dc21c4e1-Supplemental.pdf

547b85f3fafdf30856386753dc21c4e1-Paper.pdf

Covariance-Aware Private Mean Estimation Without Private Covariance Estimation

182b39a4458fb4a9a8d6871a6671ff3e-Paper-Conference.pdf

Outlier-Robust Sparse Estimation via Non-Convex Optimization

Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions